IMAGE INFORMATION CONVERSION APPARATUS 
AND IMAGE INFORMATION CONVERSION METHOD 

BACKGROUND OF THE INVENTION 

This invention relates to an image information 
conversion apparatus and an image information conversion 
method, and more particularly to an image information 
conversion apparatus and an image information conversion 
method which are used to receive, through network media
such as a satellite broadcast, a cable television
broadcast or the Internet, or to process, on a recording
medium such as an optical disk or a magneto-optical disk,
image information in the form of a bit stream compressed
by an orthogonal transform such as the discrete cosine
transform and by motion compensation.

In recent years, apparatus which handle image information
as digital data and compress it by an orthogonal transform
such as, for example, the discrete cosine transform and by
motion compensation, utilizing the redundancy unique to
image information in order to transmit and store the
information with a high efficiency, have become popular
both for information distribution from a broadcasting
station or the like and for information reception in
general homes.

Particularly, MPEG2, standardized by the MPEG (Moving
Picture Experts Group), is defined as a general purpose
image coding system in ISO/IEC 13818-2 and covers both
interlaced scan images and progressive scan images as
well as standard resolution images and high resolution
images. Therefore, MPEG2 is expected to be used in a wide
variety of applications, from professional applications
to consumer applications, in the future.

Where such an MPEG2 compression system as described
above is used, a high compression ratio and a good
picture quality can be anticipated by allocating a code
amount (hereinafter referred to as bit rate) of 4 to 8
Mbps to interlaced scan images of a standard resolution
having, for example, 720 × 480 pixels, or by allocating a
bit rate of 18 to 22 Mbps to interlaced scan images of a
high resolution having, for example, 1,920 × 1,088
pixels.

MPEG2 is directed principally to high picture quality
coding suitable for broadcasting, and does not support a
bit rate lower than that of MPEG1, that is, coding at a
higher compression ratio. However, with the
popularization of portable terminals, the need for a
coding system of such a higher compression ratio is
expected to increase in the future. Therefore, the MPEG4
coding system has been standardized, and the image coding
system of the MPEG4 was approved as the international
standard ISO/IEC 14496-2 in December 1998.

In order to process, on a portable terminal or the like,
MPEG2 image compression information (hereinafter referred
to as MPEG2 bit stream) coded once so as to be suitable
for digital broadcasting, it is necessary to convert the
MPEG2 bit stream into MPEG4 image compression information
(hereinafter referred to as MPEG4 bit stream) of a lower
bit rate.

An image information conversion apparatus (transcoder)
which satisfies this demand is disclosed in Susie J. Wee,
John G. Apostolopoulos and Nick Feamster, "Field-to-Frame
Transcoding with Spatial and Temporal Downsampling",
ICIP '99 (hereinafter referred to as document 1). This
image information conversion apparatus is shown in FIG. 4.

Referring to FIG. 4, the image information conversion
apparatus 100 shown includes a picture type
discrimination section 101, an MPEG2 image information
(I/P picture) decoding section 102, a reduction section
103, a video memory 104, an MPEG4 image information
(I/P-VOP) coding section 105, a motion vector synthesis
section 106, and a motion vector detection section 107.
It is to be noted that the VOP (Video Object Plane) in
the MPEG4 corresponds to the frame in the MPEG2.

The picture type discrimination section 101 receives data
of frames of an MPEG2 bit stream of an interlaced scan as
an input thereto and discriminates whether the data of
each frame is that of an I picture (intra-image coded
picture) or a P picture (forward predictive coded
picture), hereinafter referred to collectively as I/P
pictures, or that of a B picture (bi-directionally
predictive coded picture). The picture type
discrimination section 101 outputs only the data of I/P
pictures to the MPEG2 image information decoding section
102 of the following stage.

The MPEG2 image information decoding section 102
executes processing similar to that of an ordinary MPEG2
image information decoding section. However, since data
regarding B pictures are discarded by the picture type
discrimination section 101, the MPEG2 image information
decoding section 102 is required only to have a function
of decoding I/P pictures.




The reduction section 103 receives pixel values from the
MPEG2 image information decoding section 102, reduces the
image to 1/2 in the horizontal direction and, in the
vertical direction, discards the data of one of the first
and second fields while leaving the data of the other
field, thereby producing a progressive scan image having
a size of 1/4 that of the inputted image information.

If the MPEG2 bit stream inputted from the MPEG2 image
information decoding section 102 represents images
compliant with the standards of the NTSC (National
Television System Committee), that is, interlaced scan
images of 720 × 480 pixels and 30 Hz, then the images
after the reduction by the reduction section 103 have a
size of 360 × 240 pixels. However, in order to allow
processing in units of macro blocks when the MPEG4 image
information coding section 105 in the succeeding stage
performs coding, the pixel numbers in both the horizontal
and vertical directions must be multiples of 16.
Accordingly, the reduction section 103 further performs
supplementation or discarding of pixels to satisfy this
requirement. In particular, in the specific case
described above, eight lines of pixels, for example, at
the right end or the left end in the horizontal direction
are discarded so that the image has a size of
352 × 240 pixels.
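
The size arithmetic of the reduction can be sketched as follows. This is only an illustration under stated assumptions: pixels are discarded rather than supplemented to reach a multiple of 16, and the name `reduced_size` and its parameters are hypothetical, not taken from document 1.

```python
MACRO_BLOCK = 16  # macro block edge length in pixels


def reduced_size(width, height, h_factor=2, v_factor=2):
    """Return the raw size after reduction and the size after
    cropping both dimensions down to a multiple of 16."""
    raw_w = width // h_factor   # horizontal reduction
    raw_h = height // v_factor  # e.g. keeping one of the two fields
    crop_w = raw_w - raw_w % MACRO_BLOCK
    crop_h = raw_h - raw_h % MACRO_BLOCK
    return (raw_w, raw_h), (crop_w, crop_h)


raw, cropped = reduced_size(720, 480)
print(raw, cropped)  # (360, 240) (352, 240)
```

With the same assumptions, `reduced_size(720, 480, 4, 4)` crops 180 × 120 down to 176 × 112, which is the QSIF size quoted in the text.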

The progressive scan image produced by the 
reduction section 103 is stored into the video memory 104 
and then undergoes coding processing by the MPEG4 image 
information coding section 105, and is outputted as an 
MPEG4 bit stream. 

Motion vector information in the inputted MPEG2 bit 
stream is supplied to the motion vector synthesis section 
106, by which it is mapped to motion vectors for the 
image information after the reduction. 

The motion vector detection section 107 detects
motion vectors of a high degree of accuracy based on the
motion vector values synthesized by the motion vector
synthesis section 106.

The image information conversion apparatus 100
disclosed in document 1 produces an MPEG4 bit stream of
progressive scan images having a size of 1/2 × 1/2 that
of an inputted MPEG2 bit stream. For example, where the
inputted MPEG2 bit stream complies with the NTSC
standards, the MPEG4 bit stream to be outputted has the
SIF size (352 × 240 pixels). By modifying the operation
of the reduction section 103, the image information
conversion apparatus 100 can also convert the inputted
MPEG2 bit stream into images of any other image size,
for example, the QSIF (176 × 112 pixels) size, which is a
size of approximately 1/4 × 1/4 in the example described
above.

Further, the image information conversion apparatus
100 performs, as a process by the MPEG2 image information
decoding section 102, either a decoding process using all
of the eighth-order discrete cosine transform
coefficients in the inputted MPEG2 bit stream for the
horizontal and vertical directions, or a decoding process
using only low-frequency components from among the
eighth-order discrete cosine transform coefficients only
for the horizontal direction or for both of the
horizontal and vertical directions. The latter reduces
the arithmetic operation amount for the decoding process
and the video memory capacity while suppressing the
picture quality deterioration to the minimum.
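
The idea of decoding from low-frequency coefficients only can be illustrated in one dimension. The text does not specify the reduced-order transform actually used, so the following is a sketch under an assumption: the 4 lowest of 8 orthonormal DCT coefficients are scaled by 1/sqrt(2) and passed through a 4-point inverse DCT, which yields a half-width signal directly. All function names are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out


def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a coefficient sequence."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * coeffs[k] * math.cos(
                math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out


def decode_half_width(coeffs8):
    """Reconstruct 4 samples from the 4 lowest of 8 coefficients.

    Assumption: the 1/sqrt(2) factor compensates for the change of
    transform length from 8 to 4 so that flat areas keep their level.
    """
    low = [c / math.sqrt(2.0) for c in coeffs8[:4]]
    return idct_1d(low)


row = [100.0] * 8                      # a flat row of 8 pixels
half = decode_half_width(dct_1d(row))  # 4 pixels, still about 100
```

Only a 4-point inverse transform and 4 output samples per row are needed, which is where the saving in arithmetic and video memory comes from.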

In the image information conversion apparatus 100
shown in FIG. 4, the code amount control of the MPEG4
image information coding section 105 is a significant
factor in determining the picture quality of the MPEG4
bit stream. ISO/IEC 14496-2 does not specifically
prescribe a system for code amount control, and each
vendor can use a system which it considers optimum from
the point of view of the arithmetic operation amount and
the output picture quality in accordance with the
application. In the following, the system prescribed in
the MPEG2 Test Model 5 (ISO/IEC JTC1/SC29/WG11 N0400) is
described as a representative code amount control system.

For the code amount control, bit distribution to 
each picture is performed as a first step using a target 
code amount (target bit rate) and a GOP (Group Of 
Pictures) configuration as input variables, and then rate 
control is performed using a virtual buffer, whereafter 
adaptive quantization for each macro block is performed 
finally taking a visual characteristic into consideration. 
The operation of the code amount control is illustrated 
in FIG. 5. 

Referring to FIG. 5, first in step S101, the MPEG4
image information coding section 105 distributes an
allocation bit amount to each picture in a GOP based on
the bit amount (hereinafter represented by R) to be
allocated to those pictures which are not coded as yet,
including the allocation object picture. This
distribution is repeated in the coding order of the
pictures in the GOP. In this instance, the code amount
allocation to each picture is performed based on the
following two assumptions.

First, it is assumed that the product of the average
quantization scale code used for coding of each picture
and the generated code amount is fixed for each picture
type unless the screen changes. Therefore, after each
picture is coded, variables $X_i$, $X_p$ and $X_b$
(global complexity measures), each representative of the
complexity of the screen, are updated in accordance with
the following expressions (1) to (3) for the individual
picture types:

$$X_i = S_i Q_i \tag{1}$$

$$X_p = S_p Q_p \tag{2}$$

$$X_b = S_b Q_b \tag{3}$$

where $S_i$, $S_p$ and $S_b$ are the generated code bit
amounts upon picture coding, and $Q_i$, $Q_p$ and $Q_b$
are the average quantization scale codes upon picture
coding. The variables $X_i$, $X_p$ and $X_b$ have initial
values represented by the following expressions (4) to
(6), respectively, using the target code amount (target
bit rate) bit_rate [bits/sec]:

$$X_i = 160 \times \mathit{bit\_rate} / 115 \tag{4}$$

$$X_p = 60 \times \mathit{bit\_rate} / 115 \tag{5}$$

$$X_b = 42 \times \mathit{bit\_rate} / 115 \tag{6}$$
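
The bookkeeping of the global complexity measures can be sketched as follows, assuming the complexities are held in a dictionary keyed by picture type. The function names and the dictionary layout are illustrative choices, not part of the Test Model.

```python
def initial_complexities(bit_rate):
    """Initial values of the complexity measures, expressions (4) to (6)."""
    return {
        "I": 160.0 * bit_rate / 115.0,
        "P": 60.0 * bit_rate / 115.0,
        "B": 42.0 * bit_rate / 115.0,
    }


def update_complexity(x, picture_type, generated_bits, avg_qscale):
    """Expressions (1) to (3): X = S * Q for the picture just coded."""
    x[picture_type] = generated_bits * avg_qscale
    return x


x = initial_complexities(bit_rate=1_150_000)
# After coding an I picture with 200,000 bits at an average scale of 9:
update_complexity(x, "I", generated_bits=200_000, avg_qscale=9.0)
```

Only the entry of the picture type just coded is replaced; the other two keep their previous values until a picture of that type is coded.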

Secondly, it is assumed that the picture quality of
the entire image is always optimized when the ratios
$K_p$ and $K_b$ of the quantization scale codes of P and
B pictures to the quantization scale code of an I picture
have the values defined by the following expression (7):

$$K_p = 1.0; \quad K_b = 1.4 \tag{7}$$

In particular, the quantization scale code of a B
picture is always 1.4 times the quantization scale codes
of I and P pictures. This is based on the supposition
that, if B pictures are coded somewhat more coarsely than
I and P pictures and the code amount thus saved on the B
pictures is added to that of the I and P pictures, then
the picture quality of the I and P pictures is improved,
and the picture quality of the B pictures which refer to
the I and P pictures is improved as well.

From the two assumptions specified above, the
allocation bit amounts ($T_i$, $T_p$, $T_b$) to the
different pictures of the GOP have the values given by
the following expressions (8) to (10), respectively:

$$T_i = \max\left\{ \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}},\ \frac{\mathit{bit\_rate}}{8 \times \mathit{picture\_rate}} \right\} \tag{8}$$

$$T_p = \max\left\{ \frac{R}{N_p + \dfrac{N_b K_p X_b}{K_b X_p}},\ \frac{\mathit{bit\_rate}}{8 \times \mathit{picture\_rate}} \right\} \tag{9}$$

$$T_b = \max\left\{ \frac{R}{N_b + \dfrac{N_p K_b X_p}{K_p X_b}},\ \frac{\mathit{bit\_rate}}{8 \times \mathit{picture\_rate}} \right\} \tag{10}$$

where $N_p$ and $N_b$ are the numbers of P and B pictures
which are not coded in the GOP as yet.
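
A minimal sketch of the target allocation of expressions (8) to (10) follows. It assumes the complexities are passed as a dictionary keyed by picture type; the function name and argument names are hypothetical.

```python
def target_bits(picture_type, r, x, n_p, n_b,
                bit_rate, picture_rate, k_p=1.0, k_b=1.4):
    """Allocation bit amount for the next picture, expressions (8) to (10).

    r        -- bits still available to the uncoded pictures of the GOP
    x        -- complexities {"I": Xi, "P": Xp, "B": Xb}
    n_p, n_b -- numbers of P and B pictures not coded yet
    """
    floor = bit_rate / (8.0 * picture_rate)  # lower bound on any target
    if picture_type == "I":
        denom = (1.0 + n_p * x["P"] / (x["I"] * k_p)
                 + n_b * x["B"] / (x["I"] * k_b))
    elif picture_type == "P":
        denom = n_p + n_b * k_p * x["B"] / (k_b * x["P"])
    elif picture_type == "B":
        denom = n_b + n_p * k_b * x["P"] / (k_p * x["B"])
    else:
        raise ValueError("picture_type must be 'I', 'P' or 'B'")
    return max(r / denom, floor)


x = {"I": 160.0, "P": 60.0, "B": 42.0}
t_p = target_bits("P", 1_200_000.0, x, n_p=4, n_b=8,
                  bit_rate=4_000_000, picture_rate=30)
t_b = target_bits("B", 1_200_000.0, x, n_p=4, n_b=8,
                  bit_rate=4_000_000, picture_rate=30)
```

For the same R, N_p and N_b, the ratio t_b / t_p equals K_p X_b / (K_b X_p), which is 0.5 for the values above, so each B picture is given half the bits of a P picture.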

Based on the allocation code amounts determined in
this manner, each time a picture is coded in steps S101
and S102, the bit amount R to be allocated to the
non-coded pictures in the GOP is updated in accordance
with the following expression (11):

$$R = R - S_{i,p,b} \tag{11}$$

On the other hand, when the first picture in the GOP is
to be coded, the bit amount R is updated in accordance
with the following expression (12):

$$R = \frac{\mathit{bit\_rate} \times N}{\mathit{picture\_rate}} + R \tag{12}$$

where N is the number of pictures in the GOP. The initial
value of the bit amount R at the start of a sequence is 0.
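
The running budget R can be followed with two small helpers. This is a sketch of expressions (11) and (12) only; the names `start_gop` and `after_picture` are made up for the illustration.

```python
def start_gop(r_remaining, bit_rate, n_pictures, picture_rate):
    """Expression (12): add the budget of a new GOP to what is left over."""
    return bit_rate * n_pictures / picture_rate + r_remaining


def after_picture(r_remaining, generated_bits):
    """Expression (11): subtract the bits spent on the picture just coded."""
    return r_remaining - generated_bits


r = 0.0                                    # start of the sequence
r = start_gop(r, bit_rate=4_000_000, n_pictures=15, picture_rate=30)
r = after_picture(r, generated_bits=400_000)
print(r)  # 1600000.0
```

Because the leftover of one GOP (positive or negative) is carried into the next through expression (12), overspending in one GOP reduces the budget of the following one.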

In step S102, in order to make the allocation bit
amounts ($T_i$, $T_p$, $T_b$) for the pictures determined
in accordance with the expressions (8) to (10) in step
S101 and the actually generated code amounts coincide
with each other, quantization scale codes are determined
by feedback control in units of macro blocks, based on
the capacities of three virtual buffers set independently
of each other for the individual picture types. First,
prior to coding of the j-th macro block, the occupation
amounts of the virtual buffers are determined in
accordance with the following expressions (13) to (15):

$$d_j^i = d_0^i + B_{j-1} - \frac{T_i \times (j-1)}{\mathit{MB\_cnt}} \tag{13}$$

$$d_j^p = d_0^p + B_{j-1} - \frac{T_p \times (j-1)}{\mathit{MB\_cnt}} \tag{14}$$

$$d_j^b = d_0^b + B_{j-1} - \frac{T_b \times (j-1)}{\mathit{MB\_cnt}} \tag{15}$$

where $d_0^i$, $d_0^p$ and $d_0^b$ are the initial
occupation amounts of the virtual buffers, $B_j$ is the
generated bit amount from the top of the picture to the
j-th macro block, and MB_cnt is the number of macro
blocks in one picture. The occupation amounts
($d_{MB\_cnt}^i$, $d_{MB\_cnt}^p$, $d_{MB\_cnt}^b$) of the
virtual buffers upon ending of coding of the individual
pictures are used as the initial values ($d_0^i$,
$d_0^p$, $d_0^b$) of the virtual buffer occupation
amounts for the next pictures.

Then, the quantization scale code for the j-th macro
block is calculated in accordance with the following
expression (16):

$$Q_j = \frac{d_j \times 31}{r} \tag{16}$$

where r is a variable called reaction parameter used to
control the response of the feedback loop and given by
the following expression (17):

$$r = 2 \times \frac{\mathit{bit\_rate}}{\mathit{picture\_rate}} \tag{17}$$

The initial values of the virtual buffers at the
start of coding are given by the following expressions
(18) to (20):

$$d_0^i = 10 \times \frac{r}{31} \tag{18}$$

$$d_0^p = K_p \cdot d_0^i \tag{19}$$

$$d_0^b = K_b \cdot d_0^i \tag{20}$$
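
The virtual buffer feedback of step S102 can be sketched as below. The helper names are hypothetical, the bit rate and picture rate are arbitrary example values, and no clipping of the resulting scale to the legal range is shown, since the text does not describe one.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Expression (17): r = 2 * bit_rate / picture_rate."""
    return 2.0 * bit_rate / picture_rate


def initial_occupancies(r, k_p=1.0, k_b=1.4):
    """Expressions (18) to (20): buffer fullness at the start of coding."""
    d0_i = 10.0 * r / 31.0
    return {"I": d0_i, "P": k_p * d0_i, "B": k_b * d0_i}


def buffer_occupancy(d0, bits_before_j, target_bits, j, mb_cnt):
    """Expressions (13) to (15) for the j-th macro block (j starts at 1).

    bits_before_j -- bits generated by macro blocks 1 .. j-1
    """
    return d0 + bits_before_j - target_bits * (j - 1) / mb_cnt


def reference_qscale(d_j, r):
    """Expression (16): Q_j = d_j * 31 / r."""
    return d_j * 31.0 / r


r = reaction_parameter(bit_rate=384_000, picture_rate=15)
d0 = initial_occupancies(r)
# At the first macro block no bits have been generated, so d_1 = d_0.
d1 = buffer_occupancy(d0["I"], bits_before_j=0,
                      target_bits=40_000, j=1, mb_cnt=330)
print(round(reference_qscale(d1, r), 6))  # 10.0
```

The example shows why the constant 10 in expression (18) is the reference quantization scale of the first I picture: substituting (18) into (16) gives exactly 10 regardless of the bit rate.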

In step S103, the quantization scale codes determined
in step S102 are modified with a variable called activity
for each macro block, so that a flat portion, at which
deterioration is visually conspicuous, is quantized
finely, whereas a complicated pattern portion, at which
deterioration is visually less conspicuous, is quantized
coarsely.

The activity is given by the following expression (21)
using the brightness signal pixel values of the original
picture, over a total of 8 blocks including 4 blocks of
the frame discrete cosine transform mode and 4 blocks of
the field discrete cosine transform mode:

$$act_j = 1 + \min_{sblk = 1, \ldots, 8} \left( var\_sblk \right) \tag{21}$$

$$var\_sblk = \frac{1}{64} \sum_{k=1}^{64} \left( P_k - \bar{P} \right)^2, \qquad \bar{P} = \frac{1}{64} \sum_{k=1}^{64} P_k$$

where $P_k$ is the brightness signal pixel value in a
block of the original image. The reason why a minimum
value is taken in the expression (21) is that finer
quantization is intended to be used even where only part
of the macro block is a flat portion.

Further, a normalized activity $Nact_j$ whose value
ranges from 0.5 to 2 is determined in accordance with the
following expression (22):

$$Nact_j = \frac{2 \times act_j + avg\_act}{act_j + 2 \times avg\_act} \tag{22}$$

where avg_act is the average value of the activity
$act_j$ of the picture coded last.

A quantization scale code $mquant_j$ with a visual
characteristic taken into consideration is determined in
accordance with the following expression (23) based on
the quantization scale code $Q_j$ determined in step S102:

$$mquant_j = Q_j \times Nact_j \tag{23}$$
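
The adaptive quantization of step S103 can be sketched as follows, assuming each of the 8 sub-blocks is supplied as a flat list of 64 brightness values. The function names are illustrative only.

```python
def block_variance(pixels):
    """Variance of the brightness values of one 8 x 8 block."""
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((p - mean) ** 2 for p in pixels) / n


def activity(blocks):
    """Expression (21): 1 + the smallest variance among the 8 sub-blocks."""
    return 1.0 + min(block_variance(b) for b in blocks)


def normalized_activity(act, avg_act):
    """Expression (22): maps the activity into the range 0.5 to 2."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def mquant(q_j, act, avg_act):
    """Expression (23): reference scale weighted by the visual factor."""
    return q_j * normalized_activity(act, avg_act)


flat = [128.0] * 64
busy = [0.0, 255.0] * 32
act = activity([flat] + [busy] * 7)  # one flat sub-block is enough
```

Here `act` is 1.0 because the minimum picks up the single flat sub-block, so a macro block that is flat in only one part still receives the finer quantization.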

By the way, as recited in "Theoretical Analysis of
the MPEG Compression Efficiency and Application thereof
to the Code Amount Control", Shingaku Giho, IE-95,
DSP95-10, May 1995 (hereinafter referred to as document
2), the code amount control system defined in the Test
Model 5 does not always provide a good picture quality in
an MPEG2 image coding section.

In document 2, the following system is proposed 
particularly as a technique for providing an optimum code 
amount distribution for each of frames in a GOP to 
provide a good picture quality. 

Where $N_I$, $N_P$ and $N_B$ are the numbers of those I,
P and B pictures in a GOP which are not coded as yet and
the code amounts to be allocated to them are represented
by $R_I$, $R_P$ and $R_B$, respectively, the fixed rate
condition given by the following expression (24) is
satisfied:

$$R = N_I R_I + N_P R_P + N_B R_B \tag{24}$$

Where the quantization step sizes of the individual
frames are represented by $Q_I$, $Q_P$ and $Q_B$ and m is
an order number for coordinating a quantization step size
and a reproduction error variance with each other, that
is, if it is assumed that minimization of the average of
the quantization step sizes raised to the m-th power
minimizes the reproduction error variance, then an
optimum code amount distribution for each frame in the
GOP is given by minimizing the expression (25) given
below:

$$\frac{N_I Q_I^m + N_P Q_P^m + N_B Q_B^m}{N_I + N_P + N_B} \tag{25}$$

It is to be noted that the average quantization scale Q
and the code amount R of each frame are related to each
other through the complexity X of the frame, an
intermediate variable used also in the Test Model 5, as
given by the following expression (26):

$$Q R^{\alpha} = X \tag{26}$$

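Accordingly, by calculating such code amounts $R_I$,
$R_P$ and $R_B$ as minimize the expression (25) using
Lagrange's method of undetermined multipliers, taking the
expression (26) into consideration under the restrictive
condition of the expression (24), the values given by the
following expressions (27) to (29) are determined as the
optimum code amounts $R_I$, $R_P$ and $R_B$,
respectively:

$$R_I = \frac{R}{1 + N_P \left( \dfrac{X_P}{X_I} \right)^{\frac{m}{1 + m\alpha}} + N_B \left( \dfrac{X_B}{X_I} \right)^{\frac{m}{1 + m\alpha}}} \tag{27}$$

$$R_P = \frac{R}{N_P + N_B \left( \dfrac{X_B}{X_P} \right)^{\frac{m}{1 + m\alpha}}} \tag{28}$$

$$R_B = \frac{R}{N_B + N_P \left( \dfrac{X_P}{X_B} \right)^{\frac{m}{1 + m\alpha}}} \tag{29}$$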
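
The optimum distribution can be checked numerically. The sketch below is derived here from the expressions (24) to (26) alone: setting the derivative of the Lagrangian to zero makes every code amount proportional to X^(m / (1 + m α)), and the fixed rate condition then fixes the common factor. The function name, the dictionary layout and the numeric values are assumptions made for the illustration, not values from document 2.

```python
def optimal_allocation(r_total, counts, complexities, m, alpha=1.0):
    """Per-picture code amounts minimizing the mean of Q**m.

    counts       -- uncoded pictures per type, e.g. {"I": 1, "P": 4, "B": 10}
    complexities -- complexity X per type, with Q * R**alpha = X
    """
    exponent = m / (1.0 + m * alpha)
    weights = {t: complexities[t] ** exponent for t in counts}
    norm = sum(counts[t] * weights[t] for t in counts)
    return {t: r_total * weights[t] / norm for t in counts}


counts = {"I": 1, "P": 4, "B": 10}
x = {"I": 160.0, "P": 60.0, "B": 42.0}
alloc = optimal_allocation(2_000_000.0, counts, x, m=0.5)
spent = sum(counts[t] * alloc[t] for t in counts)  # equals the budget R
```

With one I picture left in the GOP, `alloc["I"]` reduces to R divided by 1 + N_P (X_P / X_I)^e + N_B (X_B / X_I)^e with e = m / (1 + m α).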
Where α = 1, the expressions (27) to (29) and the
expressions (8) to (10) given hereinabove in the code
amount control system defined in the MPEG2 Test Model 5
have the following relationship. In particular, from the
expressions (27) to (29), the parameters $K_p$ and $K_b$
for code amount control are adaptively calculated in
accordance with the following expression (30) based on
the complexities $X_I$, $X_P$ and $X_B$ of each frame:

$$K_p = \left( \frac{X_I}{X_P} \right)^{\frac{1}{1+m}}; \quad K_b = \left( \frac{X_I}{X_B} \right)^{\frac{1}{1+m}} \tag{30}$$



In document 2, it is disclosed that a good picture 
quality is obtained by setting the value of 1/(1 + m) to 
approximately 0.6 to 1.2. 

However, it is known that the code amount control system
defined in the MPEG2 Test Model 5 has the following
limitations.

The first limitation is that step S101 of FIG. 5
cannot cope with a scene change, and therefore the
parameter avg_act used in step S103 after the scene
change has a wrong value. The second limitation is that
it is not guaranteed that the restriction condition of
the VBV (Video Buffer Verifier) prescribed in the MPEG2
and the MPEG4 is satisfied.

Accordingly, when the code amount control is actually
performed, a countermeasure against the limitations is
required.

Further, while the initial value of the reference
quantization scale for the first I-VOP in the expression
(18) is 10, this initial value is not always appropriate,
depending upon the picture pattern and the bit rate.
Particularly with an image of the SIF or the QSIF, since
the number of macro blocks is small, a time corresponding
to several VOPs may possibly be required before the
feedback loop for code amount control is stabilized.
Therefore, picture quality deterioration at an initial
stage of the Video Object may possibly be caused by the
initial value of the reference quantization scale.

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide
an image information conversion apparatus and an image
information conversion method by which picture quality
deterioration caused by setting of an initial value can
be prevented when code amount control in MPEG4 image
coding is performed based on information extracted from
MPEG2 image compression information.

In order to attain the object described above,
according to an aspect of the present invention, there is
provided an image information conversion apparatus which
receives first image compression information as an input
thereto and outputs second image compression information.
Each of the first image compression information and the
second image compression information includes at least
intra-image coded pictures and inter-image predictive
coded pictures. The apparatus includes quantization scale
determination means for using information extracted from
the first image compression information to determine an
initial value for a reference quantization scale to be
used for production of an intra-image coded picture of
the second image compression information and determining
an initial value for a virtual buffer occupation amount
for an intra-image coded picture based on the initial
value for the reference quantization scale to be used for
production of the first intra-image coded picture of the
second image compression information.

The information extracted from the first image
compression information may be an average quantization
scale of the first intra-image coded picture of the first
image compression information.


In the image information conversion apparatus, an 
initial value for a reference quantization scale to be 
used for code amount control in MPEG4 coding is 
determined based on MPEG2 image compression information 
of an interlaced scan, and the calculated initial value 
for the reference quantization scale is used to calculate 
an initial value for a virtual buffer occupation amount. 
Consequently, an MPEG4 bit stream of a progressive scan 
can be outputted while image deterioration caused by 
setting of an initial value for the reference 
quantization scale code is prevented. 

According to another aspect of the present
invention, there is provided an image information
conversion method for receiving first image compression
information as an input thereto and outputting second
image compression information, each of the first image
compression information and the second image compression
information including at least intra-image coded pictures
and inter-image predictive coded pictures, the method
comprising the steps of using information extracted from
the first image compression information to determine an
initial value for a reference quantization scale to be
used for production of an intra-image coded picture of
the second image compression information, and determining
an initial value for a virtual buffer occupation amount
for an intra-image coded picture based on the initial
value for the reference quantization scale to be used for
production of the first intra-image coded picture of the
second image compression information.

The information extracted from the first image
compression information may be an average quantization
scale of the first intra-image coded picture of the first
image compression information.

In the image information conversion method, an 
initial value for a reference quantization scale to be 
used for code amount control in MPEG4 coding is 
determined based on MPEG2 image compression information 
of an interlaced scan, and the calculated initial value 
for the reference quantization scale is used to calculate 
an initial value for a virtual buffer occupation amount. 
Consequently, an MPEG4 bit stream of a progressive scan 
can be outputted while image deterioration caused by 
setting of an initial value for the reference 
quantization scale code is prevented. 

The above and other objects, features and
advantages of the present invention will become apparent
from the following description and the appended claims,
taken in conjunction with the accompanying drawings in
which like parts or elements are denoted by like
reference symbols.



BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram showing a configuration
of an image information conversion apparatus to which the
present invention is applied;

FIG. 2 is a flow chart illustrating operation of
the image information conversion apparatus of FIG. 1 when
it converts image information;

FIG. 3 is a block diagram showing a configuration
of another image information conversion apparatus to
which the present invention is applied;

FIG. 4 is a block diagram showing a configuration
of a conventional image information conversion apparatus;
and

FIG. 5 is a flow chart illustrating a process of an
MPEG4 image information coding section of the image
information conversion apparatus of FIG. 4 which performs
code amount control using a complexity of each frame
extracted by an MPEG2 image information decoding section.



DESCRIPTION OF THE PREFERRED EMBODIMENTS 

An image information conversion apparatus according
to the present invention calculates an initial value for
a reference quantization scale to be used for MPEG4 image
coding based on information extracted from inputted MPEG2
image compression information and uses the initial value
to calculate an initial value for a virtual buffer
occupation amount, thereby to prevent picture quality
deterioration caused by an inappropriate value of the
reference quantization scale.

Referring to FIG. 1, there is shown an image
information conversion apparatus to which the present
invention is applied. The image information conversion
apparatus 1 shown includes a picture type discrimination
section 10, a compression information analysis section
11, an MPEG2 image information decoding section 12, a
reduction section 13, a video memory 14, an MPEG4 image
information (I/P-VOP) coding section 15, a motion vector
synthesis section 16, a motion vector detection section
17, an information buffer 18, and a complexity
calculation section 19.

The picture type discrimination section 10 receives
data of frames of MPEG2 image compression information of
an interlaced scan (hereinafter referred to as MPEG2 bit
stream) as an input thereto and discriminates which one
of an intra-image coded picture (hereinafter referred to
as I picture), an inter-frame predictive coded picture
(hereinafter referred to as P picture) and a
bi-directionally predictive coded picture (hereinafter
referred to as B picture) the data of each frame
represents. The picture type discrimination section 10
transmits information regarding I pictures and P pictures
(hereinafter referred to as I/P pictures) to the MPEG2
image information decoding section 12 but discards
information regarding B pictures.

The compression information analysis section 11
analyzes an average value Q, over an entire frame, of the
quantization scale used for decoding processing and a
total code amount (bit number) B allocated to the frame
in the MPEG2 bit stream, and sends the necessary
information to the information buffer 18.

The information buffer 18 stores the generated
code amounts (bit numbers) and the average quantization
scales of the I/P pictures of the MPEG2 bit stream.

The complexity calculation section 19 calculates an
estimated value of the complexity X for each VOP of MPEG4
image compression information (hereinafter referred to as
MPEG4 bit stream) from the information Q and B of each
frame stored in the information buffer 18 in accordance
with the expression (31) given hereinbelow.



The MPEG2 image information decoding section 12
performs decoding processing of information regarding I/P
pictures of the MPEG2 bit stream. While the MPEG2 image
information decoding section 12 is similar to an ordinary
MPEG2 image information decoding section, since data
regarding B pictures is discarded by the picture type
discrimination section 10, the MPEG2 image information
decoding section 12 is required only to be able to decode
I/P pictures.

The reduction section 13 receives pixel values as 
an input thereto from the MPEG2 image information 
decoding section 12, performs a reduction process to 1/2 
in the horizontal direction for the pixel values and then 
performs a process of discarding data of only one of the 
first field and the second field in the vertical 
direction while leaving data of the other field thereby 
to produce an image of a progressive scan having a size 
of 1/4 that of the inputted image information. 

If the MPEG2 bit stream inputted from the MPEG2
image information decoding section 12 represents images
conforming with the standards of the NTSC (National
Television System Committee), that is, interlaced scan
images of 30 Hz of 720 × 480 pixels, then the picture
size after the reduction processing by the reduction
section 13 is 360 × 240 pixels. However, in order to
allow processing to be performed in units of macro
blocks when coding is performed by the MPEG4 image
information coding section 15 in the following stage,
the numbers of pixels of the image in both the horizontal
and vertical directions must be multiples of 16.
Accordingly, the reduction section 13 further performs
supplementation or discarding of pixels to satisfy the
requirement. In particular, in the case described above,
for example, 8 lines of pixels at the right end or the
left end in the horizontal direction are discarded to
produce an image of 352 × 240 pixels. Here, MPEG4 image
information is referred to as I/P-VOP. The VOP (Video
Object Plane) corresponds to a frame in the MPEG2 system.

The pictures of a progressive scan produced by the 
reduction section 13 are stored into the video memory 14 
and then undergo coding processing by the MPEG4 image 
information coding section 15, and consequently are 
outputted as an MPEG4 bit stream. 

Motion vector information in the input MPEG2 bit 
stream is supplied to the motion vector synthesis section 
16 and mapped to motion vectors of the image information 
after the reduction. 

The motion vector detection section 17 detects
motion vectors of high accuracy based on the motion
vector values synthesized by the motion vector synthesis
section 16.

The image information conversion apparatus 1
produces an MPEG4 bit stream of images of a progressive
scan having a size of 1/2 × 1/2 that of the inputted
MPEG2 bit stream. In particular, if the input MPEG2 bit
stream complies with the NTSC standards, then the MPEG4
bit stream outputted has the SIF size (352 × 240 pixels).
By changing the operation of the reduction section 13,
the image information conversion apparatus 1 can also
convert the input MPEG2 bit stream into images of any
other image size, for example, in the example described
above, into images of the QSIF (176 × 112 pixels), which
is an image size of approximately 1/4 × 1/4.

Further, the image information conversion apparatus
1 performs, as processing by the MPEG2 image information
decoding section 12, a decoding process using all of the
eighth-order discrete cosine transform coefficients in
the inputted MPEG2 bit stream in both of the horizontal
and vertical directions, or a decoding process using only
low-frequency components of the eighth-order discrete
cosine transform coefficients only in the horizontal
direction or in both of the horizontal and vertical
directions, thereby to reduce the arithmetic operation
amount and the video memory capacity involved in the
decoding processing while suppressing the picture quality
deterioration to the minimum.

The average value Q over the entire frame of the 
quantization scale used for the decoding processing by 
the compression information analysis section 11 and the 
total code amount (bit number) B allocated to the frame 
in the MPEG2 bit stream are stored into the information 
buffer 18. 

The complexity calculation section 19 calculates 
the complexity X of each frame stored in the information 
buffer 18 from the information Q and B for the frame in 
accordance with the following expression (31): 

$$ X = Q \cdot B \qquad (31) $$

The complexities X of the frames calculated in 
accordance with the expression (31) above are buffered 
for one GOV and then sent as a parameter for code amount 
control to the MPEG4 image information coding section 15. 
Therefore, a delay for one GOV is required. This delay is 
implemented using a delay buffer. 
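
The calculation of expression (31) and the buffering for one GOV can be sketched as below. This is a minimal sketch under stated assumptions: the names `FrameStats` and `gov_complexities` and the fixed GOV length are hypothetical, and the actual delay buffer of the apparatus is not specified here.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    """Per-frame values held in the information buffer (hypothetical type)."""
    avg_q: float  # Q: average quantization scale over the frame
    bits: int     # B: total code amount (bit number) of the frame


def complexity(stats):
    """Expression (31): X = Q * B."""
    return stats.avg_q * stats.bits


def gov_complexities(frames, gov_len):
    """Yield the list of complexities once a whole GOV has been buffered.

    The coder therefore sees the values one GOV late. A trailing,
    incomplete GOV is simply not emitted in this sketch.
    """
    buffered = []
    for stats in frames:
        buffered.append(complexity(stats))
        if len(buffered) == gov_len:
            yield buffered
            buffered = []
```

For example, with a GOV length of 2, three input frames yield one list of two complexities, and the third frame waits for the next GOV to fill.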

In the following, description is given of the 
manner in which the complexity X of each frame in the GOV 
calculated in accordance with the expression (31) is used 
by the MPEG4 image information coding section 15. It is 
to be noted that the following description also takes 
into consideration a case where the apparatus does not 
include the picture type discrimination section 10 and 
does not perform conversion of the frame rate. 

The parameters K_p and K_b determined in accordance 
with the expression (30) represent the ratios of the 
ideal average quantization scales Q_p_ideal and Q_b_ideal 
for a P-VOP/B-VOP to the ideal average quantization scale 
Q_i_ideal for an I-VOP, as given by the following 
expression (32): 

$$ \frac{Q_{p\_ideal}}{Q_{i\_ideal}} = K_p, \quad \frac{Q_{b\_ideal}}{Q_{i\_ideal}} = K_b \qquad (32) $$

In the MPEG2 Test Model 5, the parameters K_p and K_b 
are not calculated adaptively as in the expression (30), 
but such fixed values as given by the expression (7) are 
used therefor. 

From the expressions (30) and (32), where the 
complexities of an arbitrary VOP 1 and another arbitrary 
VOP 2 are represented by X_1 and X_2 and the ideal 
quantization scales are represented by Q_1_ideal and 
Q_2_ideal, respectively, the following expression (33) is 
obtained: 

$$ \frac{Q_{2\_ideal}}{Q_{1\_ideal}} = \left( \frac{X_1}{X_2} \right)^{\frac{1}{1+m}} = K(X_1, X_2) \qquad (33) $$

However, where it is desired to use fixed values as 
given by the expression (7) as in the MPEG2 Test Model 5, 
the following expression (34) should be used in place of 
the expression (33) above: 

$$
K(X_1, X_2) =
\begin{cases}
K_p & (1 = \text{I-VOP},\ 2 = \text{P-VOP}) \\
K_b & (1 = \text{I-VOP},\ 2 = \text{B-VOP}) \\
K_b / K_p & (1 = \text{P-VOP},\ 2 = \text{B-VOP}) \\
K_p / K_b & (1 = \text{B-VOP},\ 2 = \text{P-VOP}) \\
1 & (\text{when 1 and 2 are the same type of VOP})
\end{cases}
\qquad (34)
$$


Here, it is assumed that, where the total code 
amount (bit number) allocated to the VOPs in a GOV which 
have not been coded as yet is represented by R, the 
picture quality of the GOV is optimized when the total 
code amount R is allocated as R_1, R_2, ..., R_n to the 
VOPs. In this instance, the relational expression given 
as the following expression (35) is satisfied by the 
total code amount R and the allocated code amounts R_1, 
R_2, ..., R_n: 

$$ R = R_1 + R_2 + \cdots + R_n \qquad (35) $$

Among the average quantization scale Q_k, allocated 
code amount R_k and complexity X_k of an arbitrary VOP k, 
the relationship represented by the following expression 
(36) is satisfied: 

$$ X_k = Q_k R_k \qquad (36) $$

Here, by transforming the expression (35) taking 
the expression (36) into consideration, the following 
expression (37) is obtained: 

$$ \frac{R}{R_1} = 1 + \frac{R_2}{R_1} + \cdots + \frac{R_n}{R_1} = 1 + \frac{1}{K(X_1, X_2)} \cdot \frac{X_2}{X_1} + \cdots + \frac{1}{K(X_1, X_n)} \cdot \frac{X_n}{X_1} \qquad (37) $$

Although the value obtained by the expression (33) 
or the value obtained by the expression (34) may be used 
for K(X_1, X_2) in the expression (37), use of the former 
can achieve a more optimum code amount distribution 
suitable for an image. 

Here, if the value of 1/(1 + m) is set to 1.0, 
then the necessity for exponential operation is 
eliminated, and consequently, high speed execution can be 
achieved. Further, even where the value of 1/(1 + m) is 
set to a value other than 1.0, high speed execution can 
be achieved if a table is prepared in advance and 
referred to in order to perform the exponential operation. 

While the complexity X_k of each VOP in the 
expression (37) is that obtained by MPEG4 image coding, 
if it is assumed that the complexity of each frame by 
MPEG2 image coding and the complexity of each frame by 
MPEG4 image coding are equal to each other, then a target 
code amount for the VOP can be calculated in accordance 
with the expression (37) using the complexity X_k stored 
in the complexity calculation section 19. 
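
The use of the expression (37) can be sketched as follows. The sketch assumes that K(X_1, X_2) has the adaptive form (X_1 / X_2) raised to the power 1/(1 + m), and it rearranges the expression (37) for R_1. The names `k_ratio`, `target_code_amount` and `inv_exp` (standing for 1/(1 + m)) are hypothetical and serve only as an illustration.

```python
def k_ratio(x1, x2, inv_exp=1.0):
    """K(X1, X2), assumed to be (X1 / X2) ** (1 / (1 + m)).

    inv_exp stands for 1 / (1 + m). When it is 1.0 the ratio is
    used directly and no exponential operation is performed.
    """
    if inv_exp == 1.0:
        return x1 / x2
    return (x1 / x2) ** inv_exp


def target_code_amount(r_total, complexities, inv_exp=1.0):
    """Target code amount R_1 for the first VOP in `complexities`.

    r_total      -- R, the code amount left for the VOPs not yet coded
    complexities -- [X_1, X_2, ..., X_n] of those VOPs, X_1 first
    Rearranged expression (37):
        R_1 = R / (1 + sum over k >= 2 of X_k / (K(X_1, X_k) * X_1))
    """
    x1 = complexities[0]
    denom = 1.0
    for xk in complexities[1:]:
        denom += xk / (k_ratio(x1, xk, inv_exp) * x1)
    return r_total / denom
```

One plausible way to apply this, again as an assumption, is to call it once per VOP, each time passing the code amount still remaining in the GOV and the complexities of the VOPs not yet coded.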

FIG. 2 illustrates a processing flow when the image 
information conversion apparatus 1 calculates a target 
code amount. 

Referring to FIG. 2, first in step S1, the MPEG2 
image information decoding section 12 extracts the 
average value Q and the total code amount B (bit amount) 
of each frame in a GOP. 

In step S2, the complexity calculation section 19 
calculates the complexity X. 

Then in step S3, the MPEG4 image information coding 
section 15 calculates a target code amount (target bit 
rate) based on the complexity X. 

While the MPEG2 Test Model 5 assumes that the 
complexities X_i, X_p and X_b of I, P and B pictures in a 
GOP are fixed, this assumption does not hold where the 
GOP includes a scene change or the background varies 
remarkably within the GOP. In such a case, stabilized 
code amount control is disturbed, which causes picture 
quality deterioration. With the image information 
conversion apparatus 1 shown in FIG. 1, since code amount 
control is based on the complexity of each frame of the 
inputted MPEG2 bit stream, stabilized code amount control 
can be anticipated without causing picture quality 
deterioration. 

Now, another image information conversion apparatus 
to which the present invention is applied is described in 
detail with reference to FIG. 3. 

The image information conversion apparatus 2 shown 
in FIG. 3 includes a picture type discrimination section 
20, a compression image analysis section 21, an MPEG2 
image information decoding section 22, a reduction 
section 23, a video memory 24, an MPEG4 image information 
coding section 25, a motion vector synthesis section 26, 
a motion vector detection section 27, an information 
buffer 28, a complexity calculation section 29, and an 
initial reference quantization scale determination 
section 30. 

The picture type discrimination section 20 receives 
data of frames of MPEG2 image compression information of 
an interlaced scan (hereinafter referred to as MPEG2 bit 
stream) and discriminates whether the data of each frame 
relate to an I picture or a P picture (hereinafter 
referred to as I/P picture) or to a B picture. The 
picture type discrimination section 20 sends information 
regarding an I/P picture to the compression image 
analysis section 21, but does not send information 
regarding a B picture. 

The compression image analysis section 21 analyzes 
an average value Q over an entire frame of the 
quantization scale used for decoding processing and a 
total code amount (bit number) B allocated to the frame 
in the MPEG2 bit stream, and sends necessary information 
to the information buffer 28. 

The information buffer 28 stores generated code 
amounts (bit numbers) and average quantization scales of 
I/P pictures of the MPEG2 bit stream. 

The complexity calculation section 29 calculates an 
estimated value of the complexity X of each VOP of MPEG4 
image compression information (hereinafter referred to as 
MPEG4 bit stream) from the information Q and B of the 
frames stored in the information buffer 28. 

The MPEG2 image information decoding section 22 
performs decoding processing of information regarding I/P 
pictures of the MPEG2 bit stream. Although the MPEG2 
image information decoding section 22 is similar to an 
ordinary MPEG2 image information decoding section, since 
data regarding B pictures are discarded by the picture 
type discrimination section 20, it is only required to be 
able to decode I/P pictures. 

The reduction section 23 receives pixel values from 
the MPEG2 image information decoding section 22, performs 
a reduction process to 1/2 for the pixel values in the 
horizontal direction, and performs a process of 
discarding either one of the first field and the second 
field in the vertical direction while leaving the other 
field thereby to produce images of a progressive scan 
having a size equal to 1/4 that of the inputted image 
information. 

If the MPEG2 bit stream inputted from the MPEG2 
image information decoding section 22 represents, for 
example, images complying with the NTSC (National 
Television System Committee) standards, that is, 
interlaced scan images of 30 Hz and 720 X 480 pixels, the 
image size after the reduction processing by the 
reduction section 23 is 360 X 240 pixels. However, in 
order to allow processing to be performed in a unit of a 
macro block when the MPEG4 image information coding 
section 25 in a following stage performs coding, the 
pixel numbers in both of the horizontal and vertical 
directions must be multiples of 16. Accordingly, in the 
case described above, eight columns of pixels at the 
right end or the left end in the horizontal direction are 
discarded so that the image size may be 352 X 240 pixels. 
Here, the MPEG4 image information is referred to as 
I/P-VOP. The VOP (Video Object Plane) corresponds to a 
frame of the MPEG2. 

Images of a progressive scan produced by the 
reduction section 23 are stored into the video memory 24 
and then undergo coding processing by the MPEG4 image 
information coding section 25, and consequently are 
outputted as an MPEG4 bit stream. 

Motion vector information in the inputted MPEG2 bit 
stream is supplied to the motion vector synthesis section 
26, by which it is mapped to motion vectors of the image 
information after the reduction. 

The motion vector detection section 27 detects 
motion vectors of a high degree of accuracy based on the 
motion vector values synthesized by the motion vector 
synthesis section 26. 

The image information conversion apparatus 2 
produces an MPEG4 bit stream of progressive scan images 
having a size of 1/2 X 1/2 of an inputted MPEG2 bit 
stream. In particular, where the inputted MPEG2 bit 
stream complies with, for example, the NTSC standards, 
the MPEG4 bit stream outputted has the SIF size (352 X 
240 pixels). By modifying the operation of the reduction 
section 23, the image information conversion apparatus 2 
can perform conversion into images of any other image 
size, for example, in the example described above, images 
of the QSIF (176 X 112 pixels) size which is an image 
size of approximately 1/4 X 1/4. 

Further, the image information conversion apparatus 
2 not only performs, as processing by the MPEG2 image 
information decoding section 22, a decoding process which 
uses all of eighth-order discrete cosine transform 
coefficients in the inputted MPEG2 bit stream for both of 
the horizontal and vertical directions but also performs 
another decoding process which uses only low-frequency 
components from among eighth-order discrete cosine 
transform coefficients only for the horizontal direction 
or for both of the horizontal and vertical directions 
thereby to reduce the arithmetic operation amount and the 
video memory capacity involved in decoding processing 
while suppressing the picture quality deterioration to 
the minimum. 

The initial reference quantization scale 
determination section 30 first determines an initial 
value for the reference quantization scale from the 
numbers of macro blocks included in an MPEG2 bit stream 
and an MPEG4 bit stream determined in advance, the code 
amount (bit number) allocated to the first I picture of 
the MPEG2 bit stream stored in the information buffer 28, 
the average quantization scale Q_MPEG2,I0, and the target 
code amount (target bit) of the first I-VOP of the MPEG4 
bit stream calculated by the MPEG4 image information 
coding section 25, and then calculates an initial value 
for the virtual buffer occupation amount. 



The image information conversion apparatus 2 having 
such a configuration as described above determines the 
initial value refQ_I0 of the reference quantization scale 
of the first I-VOP of the MPEG4 bit stream to be 
outputted. 

According to the first method, where the bit rates 
and the frame rates of the MPEG2 bit stream inputted to 
the image information conversion apparatus 2 and the 
MPEG4 bit stream to be outputted from the image 
information conversion apparatus 2 are represented by 
bit_rate_MPEG2, bit_rate_MPEG4 and frame_rate_MPEG2, 
frame_rate_MPEG4, respectively, the initial value refQ_I0 
for the reference quantization scale is represented by 
the following expression (38): 

$$ refQ_{I0} = \frac{1}{2} \cdot \frac{bit\_rate_{MPEG2}}{bit\_rate_{MPEG4}} \cdot \frac{frame\_rate_{MPEG4}}{frame\_rate_{MPEG2}} \cdot Q_{MPEG2,I0} \qquad (38) $$

The reason why 1/2 is used as a coefficient in the 
expression (38) above is that the quantization scale code 
62 of the MPEG2 corresponds to the MPEG4 quantization 
scale code 31. 

According to the second method, where the code 
amount (bit number) allocated to the first I picture of 
the inputted MPEG2 bit stream is represented by 
B_MPEG2,I0, the target code amount (target bit) of the 
first I-VOP of the MPEG4 bit stream to be outputted, 
which is calculated in accordance with the expressions 
(8) to (10) or the expression (37), is represented by 
T_I0, and the quantities of macro blocks included in one 
frame of the inputted MPEG2 bit stream and macro blocks 
included in one VOP of the MPEG4 bit stream to be 
outputted are represented by MB_cnt_MPEG2 and 
MB_cnt_MPEG4, respectively, the initial value refQ_I0 is 
represented in accordance with the following expression 
(39): 

$$ refQ_{I0} = \frac{1}{2} \cdot \frac{B_{MPEG2,I0}}{T_{I0}} \cdot \frac{MB\_cnt_{MPEG4}}{MB\_cnt_{MPEG2}} \cdot Q_{MPEG2,I0} \qquad (39) $$

In the expressions (38) and (39) given above, the 
quantization scale in MPEG4 coding cannot assume any 
other value than integers from 1 to 31. Therefore, one of 
the integers from 1 to 31 which is nearest to the initial 
value refQ_I0 calculated in accordance with the expression 
(38) or (39) is used as the initial value refQ_I0 which is 
used in the later processing. 
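
The two methods can be sketched as follows. This is a sketch under assumptions: it takes the expressions (38) and (39) to have the form 1/2 multiplied by a ratio of MPEG2 to MPEG4 quantities multiplied by Q_MPEG2,I0, and it resolves a tie between two nearest integers upward, which the description does not specify. All function and parameter names are hypothetical.

```python
import math


def clamp_scale(value):
    """Nearest integer in 1..31 (ties rounded up; an assumption)."""
    nearest = math.floor(value + 0.5)
    return min(31, max(1, nearest))


def ref_q_from_rates(q_mpeg2, bit_rate_mpeg2, bit_rate_mpeg4,
                     frame_rate_mpeg2, frame_rate_mpeg4):
    """First method, assumed form of expression (38)."""
    raw = (0.5
           * (bit_rate_mpeg2 / bit_rate_mpeg4)
           * (frame_rate_mpeg4 / frame_rate_mpeg2)
           * q_mpeg2)
    return clamp_scale(raw)


def ref_q_from_bits(q_mpeg2, bits_mpeg2, target_bits_mpeg4,
                    mb_cnt_mpeg2, mb_cnt_mpeg4):
    """Second method, assumed form of expression (39)."""
    raw = (0.5
           * (bits_mpeg2 / target_bits_mpeg4)
           * (mb_cnt_mpeg4 / mb_cnt_mpeg2)
           * q_mpeg2)
    return clamp_scale(raw)
```

As an illustration with made-up numbers, a 720 X 480 input frame has 45 X 30 = 1350 macro blocks and a 352 X 240 VOP has 22 X 15 = 330, so `ref_q_from_bits(8, 400000, 100000, 1350, 330)` evaluates the raw value of about 3.9 and returns 4.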

The initial value d_0^i for the virtual buffer 
occupation amount of an I-VOP is determined using the 
following expression (40): 

$$ d_0^i = \frac{refQ_{I0} \cdot r}{31} \qquad (40) $$

In the following, a case wherein the picture type 
discrimination section 20 is used to discard B pictures 
but conversion of the frame rate is not performed is also 
taken into consideration. At this time, the initial 
values d_0^p and d_0^b of the virtual buffer occupation 
amounts for P/B-VOPs may be calculated using any of the 
following methods. 

According to the first method, where the ratios K_p 
and K_b are constants given by the expression (7) above, 
the initial values d_0^p and d_0^b are calculated in 
accordance with the expression (41) given below using the 
initial value d_0^i determined using the expression (40): 

$$ d_0^p = K_p \, d_0^i; \quad d_0^b = K_b \, d_0^i \qquad (41) $$

According to the second method, similarly to the 
expression (38), the initial values refQ_P0 and refQ_B0 
are calculated in accordance with the following 
expressions (42) and (43), respectively: 

$$ refQ_{P0} = \frac{1}{2} \cdot \frac{bit\_rate_{MPEG2}}{bit\_rate_{MPEG4}} \cdot \frac{frame\_rate_{MPEG4}}{frame\_rate_{MPEG2}} \cdot Q_{MPEG2,P0} \qquad (42) $$

$$ refQ_{B0} = \frac{1}{2} \cdot \frac{bit\_rate_{MPEG2}}{bit\_rate_{MPEG4}} \cdot \frac{frame\_rate_{MPEG4}}{frame\_rate_{MPEG2}} \cdot Q_{MPEG2,B0} \qquad (43) $$

Alternatively, similarly to the expression (39), 
the initial values refQ_P0 and refQ_B0 are calculated in 
accordance with the following expressions (44) and (45), 
respectively: 

$$ refQ_{P0} = \frac{1}{2} \cdot \frac{B_{MPEG2,P0}}{T_{P0}} \cdot \frac{MB\_cnt_{MPEG4}}{MB\_cnt_{MPEG2}} \cdot Q_{MPEG2,P0} \qquad (44) $$

$$ refQ_{B0} = \frac{1}{2} \cdot \frac{B_{MPEG2,B0}}{T_{B0}} \cdot \frac{MB\_cnt_{MPEG4}}{MB\_cnt_{MPEG2}} \cdot Q_{MPEG2,B0} \qquad (45) $$


Using the initial values refQ_P0 and refQ_B0, the 
initial values d_0^p and d_0^b for the virtual buffer 
occupation amounts are calculated in accordance with the 
following expressions (46) and (47), respectively: 

$$ d_0^p = \frac{refQ_{P0} \cdot r}{31} \qquad (46) $$

$$ d_0^b = \frac{refQ_{B0} \cdot r}{31} \qquad (47) $$
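
The relation between a reference quantization scale and the initial virtual buffer occupation amount can be sketched as below. Two assumptions are made and should be checked against the earlier part of the description: that the occupation amount d and the reference quantization scale are related by d = refQ X r / 31, and that r is the reaction parameter r = 2 X bit_rate / frame_rate as in the MPEG2 Test Model 5. The function names are hypothetical.

```python
def reaction_parameter(bit_rate, frame_rate):
    """Assumed Test Model 5 definition: r = 2 * bit_rate / frame_rate."""
    return 2.0 * bit_rate / frame_rate


def initial_occupancy(ref_q, r):
    """Occupation amount that yields reference scale ref_q.

    Assumed form of expressions (40), (46) and (47):
    d = ref_q * r / 31.
    """
    return ref_q * r / 31.0


def scaled_occupancy(d0_i, k):
    """First method, expression (41): d0_p = K_p * d0_i, d0_b = K_b * d0_i."""
    return k * d0_i
```

Under these assumptions the first method needs only the I-VOP value and the constants K_p and K_b, while the second method evaluates `initial_occupancy` separately for refQ_P0 and refQ_B0.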

As described in detail above, the image information 
conversion apparatus 2 can prevent picture quality 
deterioration arising from the fact that the reference 
quantization scale has an inappropriate value, because 
the initial reference quantization scale determination 
section 30 calculates an initial value for the reference 
quantization scale for MPEG4 image coding based on 
information extracted from an inputted MPEG2 bit stream 
and then calculates an initial value for the virtual 
buffer occupation amount. 

It is to be noted that, when an initial value for a 
reference quantization scale is to be determined, the 
initial reference quantization scale determination 
section 30 may determine it from the average quantization 
scale code for the first I picture of the MPEG2 bit 
stream stored in the information buffer and the frame 
rates and the bit rates of the MPEG2 bit stream and the 
MPEG4 bit stream. 

Further, while, in the foregoing description, an 
MPEG2 bit stream is inputted and an MPEG4 bit stream is 
outputted, the input and the output are not limited to 
them, and the image compression information may be image 
compression information of, for example, the MPEG1 or the 
H.263. 

While preferred embodiments of the present 
invention have been described using specific terms, such 
description is for illustrative purposes only, and it is 
to be understood that changes and variations may be made 
without departing from the spirit or scope of the 
following claims. 